Discrimination of Word Senses with Hypernyms
Authors
Abstract
Languages are inherently ambiguous. Four out of five words in English have more than one meaning. Nowadays there is a growing number of small proprietary thesauri used for knowledge management in different applications. To enable the use of these thesauri for automatic text annotation, we introduce a robust method for discriminating word senses using hypernyms. The method uses collocations to induce word senses and to discriminate the thesaural sense from the other senses by utilizing hypernym entries taken from a thesaurus. The main novelty of this work is the use of hypernyms already at the stage of sense induction. The hypernyms enable us to cast the task as a binary scenario, namely teasing apart thesaural senses from all the rest. The introduced method outperforms the baseline and reaches an accuracy above 80%.
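The binary framing described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the window size, the stopword list, the overlap threshold, and the example sentences are all hypothetical, and the scoring rule (plain word overlap with the collocates of a hypernym) is a deliberately simple stand-in for whatever the paper actually uses.

```python
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "at", "in", "of", "its"}


def collocates(contexts, target, window=3):
    """Count non-stopword tokens within `window` positions of `target`."""
    counts = Counter()
    for tokens in contexts:
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(
                    t for t in tokens[lo:hi]
                    if t != target and t not in STOPWORDS
                )
    return counts


def is_thesaural(tokens, hypernym_profile, threshold=1):
    """Binary decision: thesaural sense (True) versus all other senses (False).

    A context counts as thesaural when it shares at least `threshold`
    distinct words with the collocates of the thesaurus hypernym.
    """
    overlap = sum(1 for t in set(tokens) if t in hypernym_profile)
    return overlap >= threshold


# Assumed setup: the thesaurus lists "cat" as the hypernym of "jaguar".
hypernym_contexts = [
    ["the", "cat", "hunts", "prey", "at", "night"],
    ["a", "wild", "cat", "stalks", "prey"],
]
profile = collocates(hypernym_contexts, "cat")

animal_context = ["the", "jaguar", "stalks", "its", "prey"]
car_context = ["the", "jaguar", "dealer", "sold", "a", "car"]

print(is_thesaural(animal_context, profile))
print(is_thesaural(car_context, profile))
```

In this sketch the hypernym supplies the evidence for the thesaural sense, so no inventory of the remaining senses is needed; every context that fails the overlap test simply falls into the "all the rest" class.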
Similar resources
Word Sense Disambiguation by Relative Selection
This paper describes a novel method for word sense disambiguation that utilizes relatives (i.e. synonyms, hypernyms, meronyms, etc. in WordNet) of a target word and raw corpora. The method disambiguates senses of a target word by selecting a relative that most probably occurs in a new sentence including the target word. Only one cooccurrence frequency matrix is utilized to efficiently disambig...
Discriminative Ability of WordNet Senses on the Task of Detecting Lexical Functions in Spanish Verb Noun Collocations
Collocations, or restricted lexical co-occurrence, are a difficult issue in natural language processing because their semantics cannot be derived from the semantics of their constituents. Therefore, such verb-noun combinations as “take a break,” “catch a bus,” “have lunch” can be interpreted incorrectly by automatic semantic analysis. Since collocations are combinations frequently used in texts...
An Iterative Approach to Estimating Frequencies over a Semantic Hierarchy
This paper is concerned with using a semantic hierarchy to estimate the frequency with which a word sense appears as a given argument of a verb, assuming the data is not sense disambiguated. The standard approach is to split the count for any noun appearing in the data equally among the alternative senses of the noun. This can lead to inaccurate estimates. We describe a reestimation p...
Clustering WordNet Senses Utilizing Modified and Novel Similarity Metrics CS 229 Final Project Report
Introduction We approach the problem of clustering senses in Princeton's WordNet (Fellbaum 1998), a manually created dictionary/thesaurus which attempts to model the structure underlying human concepts. A synset, the fundamental unit in WordNet, is represented by a group of synonyms and a gloss definition, and is connected through a variety of semantic links, such as hypernyms (type-of) or mero...
Linking Dutch Wikipedia Categories to EuroWordNet
Wikipedia provides category information for a large number of named entities but the category structure of Wikipedia is associative, and not always suitable for linguistic applications. For this reason, a merger of Wikipedia and WordNet has been proposed. In this paper, we address the word sense disambiguation problem that needs to be solved when linking Dutch Wikipedia categories to polysemous ...